ABSTRACT

We identify 42 "candidate groups" lying between 1.8 < z < 3.0 from a sample of 3502 galaxies 
with spectroscopic redshifts in the zCOSMOS-deep redshift survey within this same redshift 
interval. These systems contain three to five spectroscopic galaxies that lie within 500 kpc in 
projected distance (in physical space) and within 700 km/s in velocity. Based on extensive analysis 
of mock catalogues that have been generated from the Millennium simulation, we examine the 
likely nature of these systems at the time of observation, and what they will evolve into down to 
the present epoch. Although few of the "member" galaxies are likely to reside in the same halo 
at the epoch we observe them, 50% of the systems will eventually bring them all into the same 
halo, and almost all (93%) will have at least part of the member galaxies in the same halo by the 
present epoch. Most of the candidate groups can therefore be described as "proto-groups". A 
crude estimate of the overdensities of these structures is also consistent with the idea that these 
systems are being seen at the start of the assembly process. We also examine present-day haloes 
and ask whether their progenitors would have been seen amongst our candidate groups. For 
present-day haloes between $10^{14}$ and $10^{15}\,M_\odot/h$, 35% should have appeared amongst our candidate 
groups, and this would have risen to 70% if our survey had been fully sampled, so we can conclude 
that our sample can be taken as representative of a large fraction of such systems. There is a 
clear excess of massive galaxies above $10^{10}\,M_\odot$ around the locations of the candidate groups in a 
large independent COSMOS photo-z sample, but we see no evidence in this latter data for any 
color differentiation with respect to the field. This is however consistent with the idea that such 
differentiation arises in satellite galaxies, as indicated at z < 1, if the candidate groups are indeed 
only starting to be assembled. 

Subject headings: catalogs, Galaxies: high-redshift, Galaxies: groups: general 
1. Introduction 

Groups of galaxies, by which we mean sets of 
galaxies that occupy the same dark matter halo, 
are important for several reasons. They consti- 
tute the largest virialized systems in the universe 
and are therefore probes for the growth of struc- 
ture and eventually the underlying cosmological 
model. Furthermore, groups provide an environ- 
ment different from the field. The group environ- 
ment is suspected of influencing the evolution and 
properties of the member galaxies through various 
processes such as ram pressure stripping (Gunn & Gott 
1972, Dressler 1980, Abadi et al. 1999), strangulation 
(Larson et al. 1980, Kawata & Mulchaey 
2008), enhanced merger rate (Spitzer & Baade 
1951), galaxy harassment (Moore et al. 1996) 
and so on. Recent work at low redshift (Peng et 
al. 2010, 2012) has indicated that the dominant 
process producing environmental differentiation in 
the galaxy population at low redshift (at least 
as regards the fraction of galaxies in which star- 
formation has been "quenched") is arising from 
changes to satellite galaxies and there is evidence 
that this is true also out to z ≲ 1 (Knobel et 
al. 2012, Kovac et al. 2012). Various papers 
have established the influence of the group envi- 
ronment on the galaxy population by investigat- 
ing the morphology-density relation (Oemler 1974, 
Balogh et al. 2004) or the differences between cen- 
trals and satellites (Peng et al. 2012, Pasquali et 
al. 2010, Skibba 2009). 

Identifying groups using discrete galaxies as a 
tracer sample is a non-trivial task. Previous work 
at low and intermediate redshift discusses exten- 
sively the performance of different group finders, 
in terms of the underlying dark matter haloes. 
Common automated group finding methods are 
the friends-of-friends method (Huchra & Geller 
1982, Eke et al. 2004, Berlind et al. 2006), the 
Voronoi-Delaunay method (Marinoni et al. 2002, 
Gerke et al. 2005, Cucciati et al. 2010) or a 
combination of both (Knobel et al. 2009, 2012). 

Little is known about groups at z > 1, mostly 
because few redshift surveys have penetrated be- 
yond this depth with a high enough sampling den- 
sity to have any hope of finding any except the 
most massive. The redshift interval around z ~ 2 
is of interest for several reasons. This is when the 
first groups consisting of multiple massive (around 
M*) galaxies should appear in the Universe in sig- 
nificant numbers. It is also close to the peak of 
star-formation and AGN activity in the Universe, 
and where we might expect the first effects of the 
environment in controlling galaxy evolution to be- 
come apparent. 

Above a redshift of z ~ 2 there exist only rare 
examples of single clusters or groups in the lit- 
erature. The search for them relies on overden- 
sities around radio galaxies (Miley et al. 2006, 
Venemans et al. 2007), the search for X-ray emis- 
sion (Gobat et al. 2011) as well as overdensities 
identified with photometric redshifts (Spitler et al. 
2012, Capak et al. 2011). Some of these high 
redshift clusters have been confirmed spectroscop- 
ically later (Papovich et al. 2010, Steidel et al. 
2005, Tanaka et al. 2010 and Gobat et al. 2011). 

However, so far there has been no system- 
atic analysis of high redshift groups in spectro- 
scopic redshift surveys. As described below, the 
zCOSMOS-deep survey provides a large sample of 
galaxies at z > 1 including 3502 galaxies with us- 
able redshifts in the redshift interval 1.8 < z < 3 
in a single fairly densely sampled region of sky, 
allowing the application of the same sort of algo- 
rithm as has been used to identify groups at z < 1. 

The aim of this paper is to identify possible 
groups at 1.8 < z < 3, based on a simple link- 
ing length algorithm. We provide a catalogue of 
42 such associations. In order to understand the 
physical nature of these detected structures, we 
have carried out extensive comparisons with mock 
catalogues that have been generated by Kitzbich- 
ler & White (2007) and then passed through the 
same "group-finding" algorithms. The primary 
aim is to assess whether the galaxies in these struc- 
tures are indeed already occupying the same dark 
matter halo. We can however also use the mocks 
to follow the future fate of each galaxy and thus 
to see when, if ever, the candidate member galax- 
ies will be in the same halo, whether they will 
merge with other galaxies and so on, and what 
the structures identified at high redshift are likely 
to become by the present epoch. 

This paper is organized as follows: We first de- 
scribe the zCOSMOS-deep sample and the mock 
catalogues used to calibrate and analyze our group 
catalogue. In Section 3 we develop our group-finder 
algorithm on the basis of comparisons with 
the mocks, and produce the catalogue of 42 
associations. In Section 4 we carry out an extensive 
analysis of the mocks to see what they indicate 
for (a) the nature of the structures that we detect 
at z > 2, (b) how they develop over time, down 
to z ~ 0, and (c) how representative they are of the 
population of progenitors of massive haloes today. 
In Section 5 we examine a complementary photo-z 
sample and identify a significant excess of massive 
galaxies in the regions of the groups, but do not 
find evidence for any color differentiation of the 
population relative to the field, although we argue 
we should probably not have expected to see such 
differentiation. We then conclude the paper and 
summarize our findings. 

Where needed we adopt the following cosmolog- 
ical parameters (consistent with the Millennium 
simulation): $\Omega_m = 0.25$, $\Omega_\Lambda = 0.75$ and 
$H_0 = 73\,{\rm km\,s^{-1}\,Mpc^{-1}}$. All magnitudes are quoted in 
the AB system. 



2. Data 

2.1. The zCOSMOS-deep sample 

The zCOSMOS-deep redshift survey (Lilly et 
al. 2007, Lilly et al. 2012 in prep.) has observed 
around 10,000 galaxies in the central ~1 deg$^2$ of 
the COSMOS field. The selection of the targets 
for zCOSMOS-deep was quite complicated. 
All objects were color-selected to preferentially 
lie at high redshifts, through (mostly) a BzK 
color selection (cf. Daddi et al. 2004) with a 
nominal $K_{AB}$ cut at 23.5, supplemented by the 
purely ultraviolet ugr selection (cf. Steidel et al. 
2004). An additional blue magnitude selection was 
adopted that for most objects was $B_{AB} < 25.25$. 
These selection criteria yield a set of star-forming 
galaxies which lie mostly in the redshift range 
1.3 < z < 3 (Lilly et al. 2007). The targeted 
sources were then observed with the VIMOS spec- 
trograph at the VLT using the low resolution LR- 
Blue grism giving a spectral resolution of R = 180 
over a spectral range of 3700-6700 Å. The spatial 
sampling of zCOSMOS-deep is such that a central 
region of 0.6° x 0.62° was covered at approximately 
67% sampling, with a more sparsely sampled outer region 
extending out to 0.92° x 0.91°. Both regions are 
centered on 10 00 43 (RA), 02 10 23 (DEC). 
In total 9523 galaxies have been observed. It 
was possible to assign a spectroscopic redshift to 
7773 of them. Repeat observations, including 
some with the higher resolution FORS-2 spectrograph, 
indicate a typical velocity error of around 
300 km/s in the redshifts. 

To account for the varying reliability of the as- 
signed spectroscopic redshifts, confidence classes 
have been introduced as described in detail in Lilly 
et al. (2009, 2012 in prep.). Objects with flags 3 
and 4 have very secure redshifts, whereas objects 
with flags 1 and 2 have less secure redshifts. Flag 
9 indicates a single narrow emission line. An additional 
decimal place is used to indicate the agreement 
with the photometric redshift, putting 0.5 if 
$|z_{\rm phot} - z_{\rm spec}| < 0.1(1 + z)$, which is approximately 
three standard deviations of the scatter between 
photometric and spectroscopic redshifts. 

In this paper we only use galaxies with flags 
3, 4, 1.5, 2.5 and 9.5, meaning that the corresponding 
redshifts are either secure on their own 
or confirmed by the respective photometric red- 
shifts. Furthermore, we restrict our analysis to the 
redshift range 1.8 < z < 3 where the success rate 
in measuring secure redshifts is highest because of 
the entrance of strong ultraviolet absorption fea- 
tures into the spectral range. The final sample 
used in this paper consists of 3502 objects from 
the catalogue in Lilly et al. (2012, in prep.). In 
the central 0.36 deg$^2$ region the overall sampling 
rate of this sample relative to the target catalogue 
is about 55%. We have a comoving number density 
of $6.1 \times 10^{-4}\,{\rm Mpc}^{-3}$. 
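
The selection described in this subsection can be sketched as a simple filter. This is only an illustration: the dictionary keys and function names are hypothetical, and accepting any decimal for classes 3 and 4 (for example 3.5 or 4.5) is an assumption, since the text lists only "3, 4, 1.5, 2.5 and 9.5".

```python
def flag_accepted(flag):
    """Return True for confidence classes treated here as usable.

    Assumption: classes 3 and 4 are accepted with any decimal, while
    classes 1, 2 and 9 need the .5 decimal (photo-z agreement).
    """
    klass = int(flag)
    confirmed = abs((flag - klass) - 0.5) < 1e-6
    if klass in (3, 4):
        return True
    return klass in (1, 2, 9) and confirmed


def select_sample(galaxies, z_min=1.8, z_max=3.0):
    """Keep galaxies with an accepted flag and z_min < z < z_max."""
    return [
        gal for gal in galaxies
        if flag_accepted(gal["flag"]) and z_min < gal["z"] < z_max
    ]
```

A catalogue row would pass the filter only if both the flag test and the redshift window are satisfied.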

2.2. Mock catalogues 

2.2.1. The Millennium Simulation 

The Millennium Simulation is a large dark matter 
N-body simulation carried out in a cubic box 
of 500 $h^{-1}$ Mpc sidelength. It starts from a glass-like 
distribution of particles that is perturbed by 
a Gaussian random field and it follows the evolution 
of dark matter particles from z = 127 to 
z = 0. The results are stored in 64 snapshots, 
placed logarithmically in redshift space and starting 
from z = 20. From these dark matter particles 
merger-trees are built up through the identifica- 
tion of gravitationally bound haloes which in post- 
processing are populated with galaxies (Springel 
et al. 2005, Lemson & Springel 2006). Several 
semi-analytic models for the galaxy formation pro- 
cess have been implemented on top of the dark 
matter structure of the Millennium simulation. 
The Kitzbichler & White (2007) mocks used in 
this work are based on a galaxy formation 
semi-analytic model (SAM) as described in De Lucia & 
Blaizot (2007). 

The structure and presentation of the Millen- 
nium simulation allows us to follow both haloes 
and individual galaxies through time and therefore 
to determine the subsequent evolution of group- 
like structures that are identified at a particular 
redshift (Lemson et al. 2006). It is therefore ideal 
for the present purposes of trying to understand 
the physical nature of corresponding objects in the 
sky, provided of course that the simulation, and 
the associated galaxy formation model, are not 
grossly inconsistent with the real Universe. 

In this work we make extensive use of the six 
independent Kitzbichler & White (2007) mock 
light cones which provide "observations" of a 1.4° x 
1.4° field and in which the identities of the galaxies 
are linked to the Millennium Simulation. These 
light cones are constructed with an observer at 
redshift z = 0 using a periodic extension of the 
simulation box to cover high redshifts (Blaizot et 
al. 2005). This will inevitably lead to the even- 
tual double appearance of objects. However, for 
the field size of 1.4° x 1.4° the first duplicate will 
appear around z ~ 5, which is beyond the red- 
shift range we are interested in. Each light cone is 
based on a different observer and a different direc- 
tion and therefore can be regarded as independent 
in terms of large scale structure at high redshifts. 

2.2.2. Sample selection 

For the mock catalogues to resemble the 
zCOSMOS-deep sample we first add a straightforward 
observational velocity error to each galaxy 
by adding a velocity selected randomly from a 
Gaussian distribution with $\sigma_v = 300$ km/s. The 
main concern is to match the number densities 
of galaxies in the actual zCOSMOS sample and 
in the mocks. Starting with the set of all galaxies 
in the mocks, we applied limiting magnitudes 
in B and K. Small adjustments to the nominal 
$B_{AB} < 25.25$ and $K_{AB} < 23.5$ limits were then 
made above and below z ~ 2 so as to match 
as well as possible the shape of the N(z) number 
counts of objects in the actual data, i.e., so 
that $s = \sum_{\rm mocks} \sum_z (N_{\rm data}(z) - N_{\rm mocks}(z))^2$ was 
minimized. Given the overall sampling (spa- 
tial sampling times spectroscopic success rate) 
of zCOSMOS-deep in this redshift range, we con- 
structed, through these small magnitude adjust- 
ments, a mock sample that had exactly twice the 
surface number density as the final spectroscopic 
sample in the highly sampled central region. This 
meant a final division of the mock sample into 
two via random sampling could be used to simu- 
late the ~50% sampling of the spectroscopic data 
and yield a second, complementary, mock sample 
from the same light cone. This is useful to see 
the effects of the sampling as well as doubling the 
number of mock samples. 
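
The three steps of this procedure (velocity error, N(z) matching statistic, random split into two complementary samples) might be sketched as follows. The function names are hypothetical, and the conversion of a velocity error into a redshift error via dz = (1 + z) dv / c is an assumption of this sketch, not a detail taken from the text.

```python
import random

C_KMS = 299792.458  # speed of light in km/s


def add_velocity_error(z, sigma_v=300.0, rng=random):
    """Perturb a redshift by a Gaussian line-of-sight velocity error.

    Assumes dz = (1 + z) * dv / c, valid for dv much smaller than c.
    """
    dv = rng.gauss(0.0, sigma_v)
    return z + (1.0 + z) * dv / C_KMS


def nz_mismatch(n_data, n_mocks):
    """Statistic s: sum over mocks and redshift bins of (N_data - N_mock)^2."""
    return sum(
        (d - m) ** 2
        for mock in n_mocks
        for d, m in zip(n_data, mock)
    )


def split_in_two(items, rng=random):
    """Randomly split a sample into two complementary halves (~50% sampling)."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]
```

Passing a seeded `random.Random` instance as `rng` makes the perturbation and the split reproducible.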

It should be emphasized that the goal of this 
exercise was to produce a mock sample that had 
the correct N(z) and was similarly dominated by 
star-forming galaxies, rather than to simulate exactly 
the selection of the objects. Such an exact 
simulation would have depended on the details 
of the galaxy formation prescription used in the 
SAM, and on the uncertain vagaries 
of the zCOSMOS-deep spectroscopic success rate 
etc. Figure 1 shows the resulting N(z) averaged 
over all twelve mock samples, compared with that 
of the zCOSMOS sample. 

Fig. 1. The average N(z)-distribution of the objects 
in the final mock catalogues (red) after adjustment, 
as compared to the N(z)-distribution of the actual 
zCOSMOS-deep sample (blue). The shaded area 
shows the spread of the mocks (in terms of their standard 
deviation). An adjustable magnitude cut in B 
and K was applied to the mocks in order to match the 
number density of galaxies to the data (see text). 

3. Methods 

3.1. Group definition 

Throughout this work we will use the following 
terminology: 

1. "(real) group": a set of three or more galax- 
ies which are all in the same dark matter 
halo at the epoch in question; 

2. "partial group": a set of three or more galaxies 
at least two of which are in the same dark 
matter halo at the epoch in question; 

3. "candidate group": a set of three or more 
galaxies that are identified by the group-finder 
as defined in the next section; 

4. "proto-group": a candidate group in which 
all the members will be found in a real group 
at some later epoch; 

5. "partial proto-group": a candidate group 
which will become a partial group at a later 
epoch, i.e., in which some apparent members 
at the epoch in question will never appear in 
the same halo down to z = 0; 

6. "spurious group": a candidate group in 
which none of the apparent members will 
ever belong to the same halo down to z = 0, 
i.e., the galaxies are simply projected on the 
sky. 

Fig. 2. Number of proto-groups in the mock catalogues 
(upper panel), the total number of candidate 
groups (middle) and the fraction of proto-groups 
(lower panel) as a function of the velocity linking 
length Δv [km/s] for various projected linking lengths Δr. 
The shaded areas show the spread in the mocks in 
terms of their standard deviation. The number of 
proto-groups stays largely constant after the first rise 
up to Δv ~ 700 km/s, whereas the total number of 
candidate groups keeps rising with increasing Δr and 
Δv, producing a declining fraction of proto-groups. 
Requiring the velocity linking length to fulfill Δv > 
700 km/s, the choice of 500 kpc for the projected linking 
length (shown in green) keeps the fraction of proto-groups 
above 50% (see text for details). The middle 
panel also shows the actual number of candidate 
groups found in zCOSMOS-deep with this parameter 
choice (black cross). This is in good agreement with 
the number of candidate groups defined in the same 
way in the mock catalogues. 

3.2. The nature of groups in the mocks 

The Kitzbichler light cones provide the galaxies 
together with a link to the actual object within 
the Millennium simulation. There an identifica- 
tion number (FOF-ID) is provided which gives the 
parent dark matter halo in which the galaxy is re- 
siding. These dark matter haloes are identified 
within the Millennium simulation using a friends- 
of-friends (FOF) algorithm applied to the dark 
matter particles. Galaxies belonging to the same 
group therefore have the same FOF-ID (Lemson 
et al. 2006). This makes it straightforward to de- 
termine the group nature (as defined above) of a 
particular set of galaxies that has been detected 
by application of the group-finder algorithm to a 
mock catalogue simulating an observational light 
cone. The galaxies in a proto-group will not share 
the same FOF-ID until the galaxies have entered 
the common halo. 

Likewise, the descendant tree of galaxies that 
is provided by the Millennium simulation can be 
used to follow the evolution of single galaxies from 
z ~ 2 to z = 0 and thereby to identify mergers 
between galaxies. When two galaxies have the same 
descendant at the next snapshot, they must have 
merged in the intervening time. 

Using the mocks and the descendant trees of 
galaxies we were therefore able to identify, in the 
mocks, which candidate groups are already real 
or partial groups, which are not yet real/partial 
but will become so at some point in the future, 
and which are totally spurious in that the galaxies 
will never reside in the same halo. We can also see 
which galaxies merge together, which by definition 
requires them to be in the same halo. 
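
The bookkeeping described in this subsection can be illustrated with a short sketch. It assumes a hypothetical input format: for each candidate group, a list of tuples of member halo identifiers (FOF-IDs), one tuple per snapshot, ordered from the epoch of observation down to z = 0, where a merged galaxy carries the halo of its descendant. The function names and the label "not assembled" are illustrative only.

```python
from collections import Counter


def largest_shared(halo_ids):
    """Largest number of members that share a single halo ID."""
    return max(Counter(halo_ids).values())


def classify(history):
    """Return (state at observation, eventual fate) for a candidate group.

    history[0] holds the member halo IDs at the epoch of observation,
    history[-1] those at z = 0. A group that is already real at
    observation is also counted as a proto-group here (an assumption).
    """
    n_members = len(history[0])

    shared_now = largest_shared(history[0])
    if shared_now == n_members:
        state = "real group"
    elif shared_now >= 2:
        state = "partial group"
    else:
        state = "not assembled"

    shared_ever = max(largest_shared(ids) for ids in history)
    if shared_ever == n_members:
        fate = "proto-group"
    elif shared_ever >= 2:
        fate = "partial proto-group"
    else:
        fate = "spurious group"
    return state, fate
```

For example, a triplet whose members occupy three different haloes at observation and a single halo at z = 0 would be labelled as not assembled now but a proto-group overall.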
Fig. 3. The location of candidate groups in the 
COSMOS field. The candidate members are shown in 
red. The underlying zCOSMOS-deep sample in the 
same redshift range is shown in blue. The red square 
shows the extent of the central, highly sampled, area. 
Not surprisingly, the detection of structure is sensitive 
to the projected density of the available tracers. 

3.3. Group finder algorithm 

There is an extensive literature on finding 
groups in spectroscopic redshift surveys, based 
on a friends-of-friends approach (Huchra & Geller 
1982, Eke et al. 2004, Berlind et al. 2006), the 
Voronoi Delaunay method (Marinoni et al. 2002, 
Gerke et al. 2005, Cucciati et al. 2010), or a com- 
bination of both (Knobel et al. 2009 and 2012). 
At lower redshifts, where the emphasis is on real 
groups in the same halo, the group finder should 
ideally only pick out real groups, minimizing the 
number of interlopers. A major concern is the 
over-merging or fragmentation of groups and a 
great deal of effort goes into controlling these issues 
(see Knobel et al. (2012) for an extensive 
discussion). Many group-finders use a friends-of-friends 
method to link galaxies into structures. In 
choosing the linking lengths Δr (in physical space) 
and Δv one has to take into consideration the following, 
sometimes conflicting, requirements: 

• The linking length has to be large enough 
to ideally encompass all groups that are 
present, but small enough not to over-merge 
groups, i.e., misidentify two distinct 
groups as one. 

• Interlopers (i.e., misidentified group galaxies) 
should be avoided. 

• The linking lengths must take into account 
the measurement errors as well as peculiar 
velocities. 

Fig. 4. The N(z)-distribution of the galaxies in the 
actual zCOSMOS-deep candidate groups (blue) compared 
to the distribution of the whole sample (grey), 
normalized to the same number of galaxies. 

Fig. 5. Comparison of the basic properties of the 
candidate groups in the mock sample (red) with those 
in zCOSMOS-deep (blue). The shaded areas show the 
spread in the mock samples in terms of their standard 
deviation. Top left: Richness (number of candidate 
member galaxies). Top right: Redshift of the candidate 
group. Bottom left: Root-mean-square radius of 
the candidate group, $r_{\rm rms}$, defined as the r.m.s. distance 
of the members to their mean RA and DEC. 
Bottom right: R.m.s. of the velocity, $v_{\rm rms}$, relative 
to the center of the candidate group defined by the 
mean redshift of the members. In general there is a 
good agreement between mocks and data, in particular 
when taking into consideration the low number of 
candidate groups in the data. 

The choice of values for the linking lengths is 
therefore a compromise. We explored the performance 
of the group-finder with varying linking 
lengths with the mock catalogues, determining 
for each resulting group catalogue the total 
number of candidate groups, the total number 
of real and/or proto-groups, and the fraction of 
real/proto-groups (see Figure 2). It turns out 
that the number of real/proto-groups stays largely 
constant with increasing velocity linking length 
beyond ~700 km/s, but increases with linking 
length Δr. The total number of candidate groups 
however increases steadily with both Δv and Δr, 
meaning that the fraction of real/proto-groups decreases 
with Δv and with Δr. We set a fraction 
of real/proto-groups of 50% as a minimum 
requirement. Because of the initial upturn in 
the number of real/proto-groups we also want to 
have Δv > 700 km/s. It then turns out that the 
maximal linking length Δr (physical space) that 
fulfills these two requirements is 500 kpc. The 
values Δr = 500 kpc and Δv = 700 km/s are slightly 
higher than for instance in Knobel et al. 
(2009), who use 300-400 kpc and ~400 km/s. 
This is, however, justified by the larger measurement 
errors at our higher redshifts. 
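
A friends-of-friends linking step with these two linking lengths could look roughly like the sketch below. It is not the authors' implementation: pairwise linking with a union-find structure, the small-angle approximation, the velocity difference c |z1 - z2| / (1 + z_mean), and the trapezoid-rule distance integral are all assumptions of this illustration, and every function name is hypothetical. The cosmological parameters are those quoted in the Introduction.

```python
import math

C_KMS = 299792.458
H0, OMEGA_M, OMEGA_L = 73.0, 0.25, 0.75  # flat cosmology quoted in the text


def angular_diameter_distance_kpc(z, steps=1000):
    """Angular diameter distance in physical kpc (trapezoid rule, flat LCDM)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zz) ** 3 + OMEGA_L)
    dz = z / steps
    integral = sum(
        0.5 * (inv_e(i * dz) + inv_e((i + 1) * dz)) * dz for i in range(steps)
    )
    d_comoving_mpc = C_KMS / H0 * integral
    return 1000.0 * d_comoving_mpc / (1.0 + z)


def linked(g1, g2, dr_kpc=500.0, dv_kms=700.0):
    """True if two galaxies (ra, dec in degrees, z) satisfy both linking lengths."""
    ra1, dec1, z1 = g1
    ra2, dec2, z2 = g2
    z_mean = 0.5 * (z1 + z2)
    # Small-angle separation on the sky, in radians.
    dra = math.radians(ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    ddec = math.radians(dec1 - dec2)
    r_proj = math.hypot(dra, ddec) * angular_diameter_distance_kpc(z_mean)
    dv = C_KMS * abs(z1 - z2) / (1.0 + z_mean)
    return r_proj <= dr_kpc and dv <= dv_kms


def find_candidate_groups(galaxies, min_members=3):
    """Connected components of the link graph with at least min_members."""
    parent = list(range(len(galaxies)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(galaxies)):
        for j in range(i + 1, len(galaxies)):
            if linked(galaxies[i], galaxies[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(galaxies)):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) >= min_members)
```

The pairwise loop scales quadratically with the number of galaxies, which is acceptable for a sample of a few thousand objects; a spatial index would be needed for much larger catalogues.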

3.4. Application to zCOSMOS sample and 
comparison with mocks 

Having determined the parameters of the FOF 
algorithm in the previous section, we apply the 
group-finder to the actual zCOSMOS data and 
the 12 mock samples. In the data this results in 
42 candidate groups with memberships of three or 
more, i.e., we do not consider "pairs". Of these 
42, one has five members and six have four, so 
the vast majority are triplets. The 42 candidate 
groups are listed in Table 1, and their redshift distribution 
as compared to the parent sample is shown 
in Figure 4. Almost all of the detected candidate 
groups are in the central more highly sampled 
region of the field, as shown in Figure 3. 

Group  ID      RA [deg]  DEC [deg]  z      r_rms [kpc]  v_rms [km/s]  N
…      …       …         …          2.232  229          140           3
7      426726  150.397   2.000      2.707  287          143           3
19     429340  149.993   2.206      2.554  279          147           3
39     429401  150.036   2.205      2.096  324          206           3
42     434564  149.870   2.343      2.678  319          222           3
5      426418  150.214   1.964      2.117  269          227           3
17     429152  149.933   2.199      2.279  261          239           3
26     411468  150.249   2.333      2.469  297          239           3
13     407675  150.194   2.118      2.178  385          251           4
32     434605  150.452   2.396      2.286  110          254           3
36     413529  150.102   2.456      2.476  294          264           3
28     411517  150.338   2.344      1.805  224          281           3
40     429794  150.098   2.232      2.099  302          284           3
35     413241  150.186   2.436      2.051  260          296           3
41     434071  150.332   1.892      2.957  257          304           4
34     431678  150.461   2.427      2.322  169          316           3
2      402591  150.329   1.841      2.096  351          322           3
11     427339  150.272   2.050      2.306  214          328           3
12     406198  150.588   2.055      2.029  369          340           3
10     490746  149.921   2.028      2.050  459          365           4
29     431233  150.452   2.356      2.278  282          381           3
1      424327  150.327   1.766      2.538  229          386           3
14     428112  150.359   2.118      2.232  126          405           3
3      425554  149.900   1.883      2.215  190          415           3
4      425598  150.218   1.892      2.684  217          435           3
27     430794  150.008   2.325      2.258  275          474           4
37     413838  150.028   2.479      2.452  146          476           3
33     413105  150.060   2.423      2.469  335          488           3
38     433521  150.153   2.603      2.282  281          496           3
8      426762  150.449   2.010      2.013  293          505           4
15     428229  150.517   2.121      2.153  102          507           3
18     420527  150.354   2.206      1.808  188          513           3
22     430097  150.000   2.256      2.440  412          526           5
31     431338  149.928   2.384      2.143  113          534           4
24     410797  150.056   2.305      1.974  237          545           3

Table 1: Candidate groups detected in zCOSMOS-deep, ordered by their velocity dispersion v. 

For each zCOSMOS candidate group, and for 
the corresponding candidate groups in the mocks, 
we also compute a nominal r.m.s. size and velocity 
dispersion by $r_{\rm rms} = \sqrt{\sum_i r_i^2/(N-1)}$ and 
$v_{\rm rms} = \sqrt{\sum_i v_i^2/(N-1)}$, where $r_i$ and $v_i$ denote 
the distance or the velocity of a galaxy to the center 
of the candidate group and N is the number 
of members. 
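
As a minimal sketch, the two r.m.s. quantities can be computed as below. The helper names are hypothetical, and converting redshift offsets to velocities with c (z_i - z_mean) / (1 + z_mean) is an assumption of this illustration rather than a formula given in the text.

```python
import math

C_KMS = 299792.458  # speed of light in km/s


def rms(offsets):
    """sqrt(sum_i x_i^2 / (N - 1)) for offsets x_i relative to the group center."""
    n = len(offsets)
    return math.sqrt(sum(x * x for x in offsets) / (n - 1))


def velocity_offsets(redshifts):
    """Line-of-sight velocities relative to the mean redshift of the members."""
    z_mean = sum(redshifts) / len(redshifts)
    return [C_KMS * (z - z_mean) / (1.0 + z_mean) for z in redshifts]
```

Calling `rms` on the projected distances gives r_rms, and calling it on the output of `velocity_offsets` gives v_rms.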

The center of the candidate group is defined by 
the average RA, DEC and z. The overall number 
of candidate groups found in the central area of 
zCOSMOS-deep (36 groups) agrees quite well with 
the average number found in the mocks, which 
is 44 per 0.36 deg$^2$, i.e., the actual data has 18% 
fewer candidate groups. As shown in Figure 5, 
there is also broad agreement in the distributions 
of redshift, richness, nominal size $r_{\rm rms}$ 
and velocity dispersion $v_{\rm rms}$. 

4. Results 

4.1. Are we detecting real groups at z > 2? 

We find that only 5 (out of 2791 in total), i.e., 
less than 0.2%, of the candidate groups in the 
mocks are real groups in the sense that all of the 
members are already in the same dark matter halo 
at the time of observation. However, 8% of the ob- 
served structures are partially assembled with two 
galaxies in the same halo, meaning that we are 
observing groups with interlopers. 

The Millennium simulation used WMAP1 cosmological 
parameters (with $\sigma_8 = 0.9$), whereas 
the most recent cosmological data establish a lower 
value for $\sigma_8$, implying a lower build-up of structure 
at a given redshift. As would be expected, the 
mock catalogues described in Wang et al. (2008), 
where $\sigma_8 = 0.81$ using the WMAP3 parameters 
(which are close to the most recent estimates), also 
yield essentially no real groups amongst the candidate 
groups. 

4.2. Assembly timescale 

We established above, based on comparisons 
with the mocks, that the detected structures have 
not yet assembled when we observe them. In 8% 
of the candidate groups, two of the galaxies are 
already in the same halo, but essentially no candidate 
group has assembled all three members. It is 
therefore an interesting question to see when and 
if these actually become groups, i.e., if they are 
what we call "proto-groups" at z ~ 2. 

The Millennium simulation allows us to fol- 
low the evolution of the structures we detect at 
z ~ 2 down to z = 0, i.e., to see when, if ever, 
the structures detected in zCOSMOS will merge 
into a common halo. It turns out that at the 
present time only 7% of the detected candidate 
groups still have all of their galaxies outside of a 
common halo. 93% of the candidate groups will either 
fully (50±1%) or partially (43±1%) assemble 
by the present epoch. The main criterion that distinguishes 
proto-groups from partial or spurious 
ones is the velocity dispersion $v_{\rm rms}$. This is shown 
in Figure 6. In the regime $v_{\rm rms} < 300$ km/s (which 
is comparable to the velocity error in the data) 
the fraction of proto-groups is above 50%, whereas 
it drops below 50% for velocity dispersions larger 
than 300 km/s. The fraction of proto-groups does 
not depend on the projected radial size of the 
group. The trend with velocity dispersion is, however, 
weak enough that it is not attractive to reject 
all candidate groups with $v_{\rm rms} > 300$ km/s. 

As stated above, 93% of the candidate groups 
become real or partial groups by the present 
epoch. Already by z ~ 1.5, 50% of the candidate 
groups are partial groups (up from 8% at the 
epoch of observation, see Figure 8) and by the current 
epoch, 50% of the candidate groups at z ~ 2 are 
real groups with all detected members within the 
same halo. The majority of the proto-groups start 
to assemble within Δa < 0.1 (see Figure 7, a 
being the cosmic scale factor), which means that 
on a rather short timescale two or more members 
will share the same FOF-halo. The full assembly 
then requires a substantially larger timescale 
(Δa ~ 0.5 or even more). 
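
To give a feeling for these intervals in scale factor, the sketch below converts a change Δa into the redshift reached, using a = 1/(1 + z). The function names are hypothetical and the starting redshift of 2 is only an illustrative choice.

```python
def scale_factor(z):
    """Cosmic scale factor a = 1 / (1 + z), normalized to a = 1 today."""
    return 1.0 / (1.0 + z)


def redshift_after(z_obs, delta_a):
    """Redshift reached once the scale factor has grown by delta_a since z_obs."""
    a_later = scale_factor(z_obs) + delta_a
    if a_later > 1.0:
        raise ValueError("scale factor above 1 lies in the future")
    return 1.0 / a_later - 1.0


# A group observed at z = 2 (a = 1/3):
z_start = 2.0
z_partial = redshift_after(z_start, 0.1)  # close to z = 1.31
z_full = redshift_after(z_start, 0.5)     # close to z = 0.2
```

Under these assumptions, Δa < 0.1 from z = 2 corresponds to the first members sharing a halo before z of roughly 1.3, while Δa ~ 0.5 brings the full assembly close to the present epoch.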

This continuous assembly process is further illustrated 
in Figure 8, which emphasizes that assembly 
is taking place even within the observational 
"window". Although only 8% of the candidate 
groups are partially assembled by the time we observe 
them, by the end of the observing window 
at z = 1.8 around 25% of the proto-groups already 
have members in the same dark matter halo. 
These are therefore groups of richness 2 "contaminated" 
by an interloper (most of which obviously 
later on will accrete onto the group). We are therefore 
able to actually observe the earliest phases of 
the assembly process of these groups. 

Fig. 6. The fraction of proto-groups in the mocks 
as a function of their velocity dispersion $v_{\rm rms}$ and size 
$r_{\rm rms}$. This fraction strongly depends on $v_{\rm rms}$ whereas 
it is largely independent of $r_{\rm rms}$. For $v_{\rm rms} \lesssim 300$ km/s 
the fraction of proto-groups is above 50%. The observed 
$v_{\rm rms}$ is a crude indicator for the chance of a 
candidate group to become a real group in which all 
the galaxies share the same halo. The black circles 
show the location of the zCOSMOS-deep candidate 
groups. 

Fig. 7. The subsequent assembly of the proto-groups 
in the mocks. The diagrams show the change 
in a, the cosmic scale factor, before the proto-groups 
have accreted two (blue), and then all (black), of their 
identified members into the same halo (left panel for 
richness 3, right panel for richness ≥ 4). Most of the 
proto-groups observed at 1.8 < z < 3.0 start to assemble 
within Δa < 0.1. 

Figure 8 also illustrates the likelihood that group members seen
as distinct galaxies at z ~ 2 will have merged together by the
current epoch. In about 40% of the proto-groups, two or more of
the members that we identify at z ~ 2 will have merged together
by the current epoch, and in about 10% all three members will
have merged into a single massive galaxy.





Fig. 8. The assembly history of all the candidate groups with
richness 3 (which constitute over ~85% of the sample) as a
function of redshift. Partially assembled systems are shown in
yellow (two members in the same dark matter halo) and fully
assembled systems in blue (all members in the same halo). The
light areas denote member galaxies that have subsequently merged
(by definition within the same halo). The grey zone represents
candidate groups in which the members are not, at least yet, in
the same halo. The white zone arises because we only follow the
evolution of a candidate group after it has been detected in the
light cone, and the diagonal grey-white border therefore reflects
the redshift distribution of the detected candidate groups. At
z = 1.8, ~25% of the candidate groups (detected at slightly
higher redshifts) have already assembled at least two of their
members into the same DM halo, up from 8% at the epoch of
observation of the individual groups.

4.3. Halo masses 

In the preceding discussion we followed the evolution of the
structures that were detected by our group-finder at z ~ 2 down
to the present epoch. In this section we look at haloes at the
present epoch and ask which of their progenitors could have been
detected at z > 1.8 in a zCOSMOS-like survey. To do this, we
examine the set of all present-day haloes in the simulation whose
progenitors lie within the 1.8 < z < 3 volume of any of the six
light cones. We first identify at the earlier epoch all of the
haloes that will eventually assemble into a given present-day
halo. We then identify all the "progenitor galaxies" within these
progenitor haloes and ask if they satisfy the zCOSMOS brightness
selection criteria, without the 50% spatial sampling, referring
to these as "zCOSMOS-selected" galaxies. Finally, we ask whether
this set of "progenitor galaxies" would have satisfied our
group-finding requirements in terms of their spatial and velocity
displacements, adding in also the incomplete spatial sampling of
the zCOSMOS survey.

The result is shown in Figure 9. Many haloes today, especially at
M < $10^{13}\,M_\odot/h$, do not have any zCOSMOS-selected
progenitor galaxies at 1.8 < z < 3. These are represented as the
light grey region of the upper panel. Some have only one or two
zCOSMOS-selected progenitor galaxies and they are shown in dark
grey, since they will by definition not be recognized as a
"proto-group". The pink region represents haloes today whose
progenitor haloes did contain three or more zCOSMOS-selected
galaxies but which were, at 1.8 < z < 3, too dispersed to satisfy
our group-finding linking lengths. Finally, the blue region
represents haloes with three or more progenitor galaxies that are
close enough to be recognized as a candidate "group". Applying
the 50% sampling of the zCOSMOS survey, about a half of these are
actually recognized (light blue); the remainder are missed simply
because of the incomplete spatial sampling of the survey.

At high present-day halo masses (above ~$10^{14}\,M_\odot/h$) the
majority of the haloes are represented in our candidate group
catalogue in the sense of detecting three or more progenitor
galaxies and recognizing them as members of a candidate group
structure. In other words, around 65% of today's
$10^{14}$ - $10^{15}\,M_\odot/h$ haloes should in principle have
been recognized as a candidate group with the galaxy selection
criteria of zCOSMOS, although a half of these will not have been
detected in practice because of the random 50% sampling of our
survey. Of the remaining 35% of present-day haloes above
~$10^{14}\,M_\odot/h$ that we would not have expected to be able
to detect, more than a half have three or more detectable
progenitor galaxies, but these are too dispersed in space or
velocity to satisfy our criteria. Increasing the linking lengths
to catch these dispersed systems would, however, as shown above,
also severely increase the number of interlopers.

Fig. 9. Top panel: The average (over the 12 mock samples)
fraction of present-day haloes that are detectable in a
zCOSMOS-like survey at 1.8 < z < 3, as a function of the
present-day dark matter mass of the halo. The light blue region
shows haloes which today contain three or more galaxies that, at
high redshift, would satisfy the zCOSMOS-deep photometric
selection criteria and would have been recognized as a candidate
group with the zCOSMOS-deep overall sampling and success rates.
The dark blue region represents candidate groups that were not
recognized simply because of the incomplete sampling/success
rate; the lack of these in our candidate group catalogue was
therefore simply a matter of chance. The pink region represents
haloes in which the constituent galaxies would have been observed
in zCOSMOS-deep, but which were too dispersed, in projected
distance or velocity, to satisfy our group-finding algorithm. The
darker grey region represents present-day haloes which only had
one or two progenitor galaxies satisfying the zCOSMOS-deep
photometric criteria, while the light grey region represents
haloes in which none of the progenitor galaxies could have been
in zCOSMOS-deep. Around 65% of all present-day
$10^{14}$ - $10^{15}\,M_\odot/h$ groups would have a progenitor
structure at z ~ 2 which we would in principle be able to
identify in zCOSMOS-deep with full sampling. Bottom panel: As in
the upper panel, but now the total number of haloes is plotted
instead of the fraction.

The lower panel of Figure 9 shows the distribution of the
present-day halo masses of the systems in our candidate group
catalogue. While, as noted in the previous paragraphs, we are
detecting a high fraction of the progenitors of the most massive
haloes today, we are evidently detecting a broad range of
present-day halo masses, with most systems in the
$10^{13}$ - $10^{14}\,M_\odot/h$ range.

4.4. Overdensities 

4.4.1. Determination of the overdensity

In order to give a rough estimate for the overdensities
$\delta = (\rho_{\rm gr} - \bar{\rho})/\bar{\rho}$ associated with
the candidate groups we calculated the mean (comoving) density
$\bar{\rho}$ of the overall sample in bins of $\Delta z$ = 0.2
using the following equation:

\[ \bar{\rho} = \frac{N}{V} = \frac{N}{\mathrm{area} \cdot (l_{\max} - l_{\min})}, \]

where l denotes the comoving distance along the line-of-sight and
area is the field of view of the mocks (1.4° × 1.4°).

The density of the groups $\rho_{\rm gr}$ was determined by
assuming a cylinder with radius $r_{\rm rms}$ and a length of
twice the $v_{\rm rms}$ (in comoving units):

\[ \rho_{\rm gr} = \frac{0.27\,N}{\pi\, r_{\rm rms}^{2}\, l}, \]

where N is the number of members, l the length of the cylinder,
and the factor 0.27 is included to account for the fact that in a
3D Gaussian distribution only this fraction of the points would
actually lie within the 1σ region (which we assumed here, by
setting the size of the cylinder to the $r_{\rm rms}$ and the
$v_{\rm rms}$).
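As a sanity check on the factor 0.27, the sketch below evaluates the fraction of an isotropic 3D Gaussian that falls inside a cylinder of radius 1σ and half-length 1σ, and then applies the cylinder density described above. It assumes that the r.m.s. size and velocity dispersion each correspond to the per-axis 1σ of the distribution, and that the group density is 0.27 N divided by the cylinder volume, which is our reading of the description. The function names and the numbers in the example call are hypothetical.

```python
import math


def gaussian_fraction_in_cylinder(k: float = 1.0) -> float:
    """Fraction of an isotropic 3D Gaussian inside a cylinder of
    radius k*sigma and half-length k*sigma."""
    radial = 1.0 - math.exp(-0.5 * k * k)  # 2D Gaussian within radius k*sigma
    axial = math.erf(k / math.sqrt(2.0))   # 1D Gaussian within +/- k*sigma
    return radial * axial


def overdensity(n_members, r_rms, length, mean_density, fraction=0.27):
    """delta = (rho_gr - rho_mean) / rho_mean for a cylinder of radius r_rms
    and total length `length` (comoving, same length unit as mean_density)."""
    rho_gr = fraction * n_members / (math.pi * r_rms ** 2 * length)
    return (rho_gr - mean_density) / mean_density


print(f"fraction inside the 1-sigma cylinder: {gaussian_fraction_in_cylinder():.3f}")
# Illustrative numbers only: 3 members, r_rms = 1 Mpc, l = 8 Mpc,
# mean density 1e-3 galaxies per cubic Mpc.
print(f"example overdensity: {overdensity(3, 1.0, 8.0, 1e-3):.1f}")
```

The product of the radial and axial fractions comes out close to 0.27, which supports reading the cylinder as a 1σ region in both the transverse and line-of-sight directions.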

The overdensity computed here is at best a rough order of
magnitude estimate. First, it refers to the density within the
r.m.s. radius containing only a fraction of the observed
galaxies, leading to an over-estimate of the mean overdensity of
all of the galaxies in the structure. An additional effect comes
from the 50% sampling rate. Adding in the missing galaxies does
not add significant numbers of new members to the detected
associations (since they were the lucky ones with above average
sampling), whereas the mean density of the field increases by a
factor of two, leading to a factor of up to two over-estimate in
the overdensity. On the other hand, due to the effect of
measurement errors in redshift (of order 300 km/s) as well as
peculiar velocities, the "size" along the line of sight may have
been substantially over-estimated, leading to an underestimate of
the actual overdensity, e.g., by almost an order of magnitude
since the observed $v_{\rm rms}$ corresponds to about 8 Mpc
(comoving) against the typical $r_{\rm rms}$ of ~1 Mpc
(comoving). The estimated overdensities should therefore be
treated with considerable caution.
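The mapping from a velocity interval to a comoving line-of-sight length can be sketched with the standard relation Δl = Δv (1+z) / H(z). The cosmological parameters below (flat ΛCDM with Ω_m = 0.25 and h = 0.73) are assumed values chosen for illustration, not numbers quoted in this section, and the function name is hypothetical.

```python
import math


def comoving_los_length(delta_v_kms, z, omega_m=0.25, h=0.73):
    """Comoving line-of-sight length in Mpc corresponding to a velocity
    interval delta_v_kms (km/s) at redshift z, for a flat LCDM cosmology."""
    # Hubble rate at redshift z in km/s/Mpc (radiation neglected).
    hubble_z = 100.0 * h * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))
    return delta_v_kms * (1.0 + z) / hubble_z


# Illustrative call: a 300 km/s interval at z = 2.3.
print(f"{comoving_los_length(300.0, 2.3):.1f} Mpc (comoving)")
```

Under these assumptions a few hundred km/s corresponds to several comoving Mpc, so the line-of-sight extent of the cylinder is much larger than its ~1 Mpc transverse size.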

4.4.2. Results

With these caveats in mind, the distribution of $\delta$ for the
42 candidate groups and for the corresponding mock samples is
shown in Figure 10. Even with the uncertainties outlined above,
it is evident that the candidate groups represent highly
overdense regions and that most of them have probably already
turned around (i.e., decoupled from the background). This would
be expected if they are to merge into a single halo within an
interval of expansion factor of $\Delta a \sim a$ as discussed
above.

4.5. Excess of high mass objects and red 
fractions 

So far we have established that the associations that we have
found are in the main not yet fully formed groups, but are quite
likely to become so by z = 0. Furthermore, the candidate groups
are already quite overdense. For this reason it is of interest to
look for surrounding overdensities and to look for any
colour-differentiation of the galaxy population in and around the
candidate groups relative to the field population. Unfortunately,
zCOSMOS-deep itself is limited to star-forming galaxies by the
colour selection, and so it is necessary to use photo-z objects
from the larger and deeper COSMOS photometric sample (Capak et
al. 2007). Typical photo-z errors are of order
$\Delta z \sim 0.03(1+z)$, or 10'000 km/s.
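The velocity equivalent of the photo-z error follows from the standard relation between a redshift interval and a line-of-sight velocity interval at redshift z:

\[ \Delta v = \frac{c\,\Delta z}{1+z} \approx 0.03\,c \approx 9000\ \mathrm{km/s}, \]

which is independent of redshift for an error that scales as (1+z), and rounds to the value of order 10'000 km/s quoted above.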

Fig. 10. The distribution of the group "overdensities" in
zCOSMOS-deep (red) and in the mocks (blue). These overdensities
are quite large and indicate that the structures are in an
advanced stage of collapse, consistent with the idea that the
galaxies will assemble into the same haloes in the future.
However, readers should see the text for discussion and important
caveats in the interpretation of this quantity.

We focus on relatively massive galaxies, above a stellar mass of
$10^{10}\,M_\odot$, so that the photo-z errors are not excessive
and so that the photo-z catalogue is complete in stellar mass.
Most of these objects have $25 < I_{AB} < 28$. We first search
for any excess of galaxies around the locations of the candidate
groups. We consider cylinders with radii that are a varying
multiple of the group $r_{\rm rms}$ and which have a fixed length
of twice 10'000 km/s. We lay down 42 cylinders, one over each
group, and compare the total number of massive
($> 10^{10}\,M_\odot$) galaxies in these cylinders to the totals
found when the 42 cylinders are laid down at positions that have
the same (z, $r_{\rm rms}$, dv) but random (RA, DEC) positions.
We repeat these random samples 1000 times and use the variation
in the random samples to give an estimate of the noise to be
expected in the group sample.
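The random-cylinder comparison described above can be sketched as follows. This is a simplified illustration rather than the procedure actually used: it assumes flat (x, y) coordinates in a square field instead of (RA, DEC) on the sky, ignores field edges and masks, and uses hypothetical function names and data layouts.

```python
import random
import statistics


def count_in_cylinders(galaxies, cylinders):
    """Sum, over all cylinders, of the number of galaxies inside each one.
    galaxies: list of (x, y, v); cylinders: list of (x, y, v, radius, half_len)."""
    total = 0
    for cx, cy, cv, radius, half_len in cylinders:
        for gx, gy, gv in galaxies:
            inside_los = abs(gv - cv) <= half_len
            inside_sky = (gx - cx) ** 2 + (gy - cy) ** 2 <= radius ** 2
            if inside_los and inside_sky:
                total += 1
    return total


def excess_with_noise(galaxies, cylinders, field_size, n_random=1000, seed=0):
    """Return (observed / mean random count, fractional scatter of the random
    counts). Random cylinders keep their v, radius and half-length but are
    given uniformly random (x, y) positions inside the square field."""
    rng = random.Random(seed)
    observed = count_in_cylinders(galaxies, cylinders)
    random_counts = []
    for _ in range(n_random):
        shuffled = [
            (rng.uniform(0.0, field_size), rng.uniform(0.0, field_size), cv, r, hl)
            for _, _, cv, r, hl in cylinders
        ]
        random_counts.append(count_in_cylinders(galaxies, shuffled))
    mean_random = statistics.mean(random_counts)
    if mean_random == 0:
        raise ValueError("random cylinders contain no galaxies; enlarge the sample")
    return observed / mean_random, statistics.stdev(random_counts) / mean_random
```

A ratio above unity by more than the returned fractional scatter would indicate an excess around the group positions. Keeping the redshift and size of each cylinder fixed while randomizing only its position means the random samples see the same redshift-dependent galaxy density as the real ones.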

Especially at small multiples of $r_{\rm rms}$ a significant
excess is seen around the candidate groups, as shown in Figure
11. Within a radius of 1 - 2 $r_{\rm rms}$ of the candidate
groups we find ~40% more massive objects than in the general
field. This excess drops at larger radii, and the ratio to the
field is consistent with unity at ~10 $r_{\rm rms}$, which
corresponds to ~3 Mpc (physical).

This excess is only slightly reduced when the spectroscopically
observed objects are excluded (red circles in Figure 11), and the
excess seen in this independent dataset provides further evidence
that the candidate groups catalogued in this paper are real
physical associations and not just chance projections.

Fig. 11. The excess of high mass ($> 10^{10}\,M_\odot$) galaxies
from the COSMOS photo-z sample around our spectroscopic candidate
groups, relative to the field, as a function of the projected
distance from the group in units of the $r_{\rm rms}$ of the
groups, as seen in cylinders of depth $\Delta v$ = ±10'000 km/s
to accommodate photo-z errors (see text for details). At the
position of the candidate groups we find a projected excess of up
to ~40% in the number of massive galaxies (blue filled circles).
This fraction reduces to ~25% if we subtract out the already
known spectroscopic members (red open circles) and also reduces
to insignificance at large radii. This concentrated mean
overdensity suggests that our candidate groups indeed trace
significant overdensities in the Universe.

Next we look at the distribution of colours in the photo-z sample
around the candidate groups with respect to the field. For this
we consider cylinders with a fixed radius of 2 $r_{\rm rms}$ and
the same length of twice 10'000 km/s as above. We define red
galaxies to be galaxies with $M_U - M_B > 0.7$ and consider a red
fraction which is the number of red galaxies at a given stellar
mass divided by the total number of galaxies at that mass. Figure
12 shows the red fractions as a function of stellar mass.
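The red fraction defined above can be computed per stellar-mass bin along the following lines. This is a minimal sketch with a hypothetical input layout (a list of log stellar mass and rest-frame U-B colour pairs) and hypothetical bin edges; only the colour cut of 0.7 is taken from the text.

```python
def red_fraction_by_mass(galaxies, mass_bin_edges, colour_cut=0.7):
    """Red fraction in each stellar-mass bin.
    galaxies: list of (log_stellar_mass, u_minus_b) pairs.
    Returns one value per bin, or None where a bin contains no galaxies."""
    fractions = []
    for lo, hi in zip(mass_bin_edges[:-1], mass_bin_edges[1:]):
        colours = [colour for mass, colour in galaxies if lo <= mass < hi]
        if not colours:
            fractions.append(None)  # red fraction undefined for an empty bin
            continue
        n_red = sum(1 for colour in colours if colour > colour_cut)
        fractions.append(n_red / len(colours))
    return fractions


# Illustrative input only; these are not measured values.
sample = [(10.2, 0.9), (10.3, 0.5), (10.7, 0.8), (10.8, 0.75)]
print(red_fraction_by_mass(sample, [10.0, 10.5, 11.0, 11.5]))
```

Applying the same function to the galaxies inside the group cylinders and to the field sample gives the two curves that are compared at fixed stellar mass.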

Fig. 12. The red fraction of objects in the photo-z sample at the
position of our candidate groups (in red) as compared to the
field (in blue) as a function of stellar mass. Red galaxies are
defined to have rest-frame $M_U - M_B > 0.7$, using the spectral
energy distributions used to estimate their photometric
redshifts. We find that there is no difference in the colours for
the field and the candidate groups. This is, however, not
surprising if the candidate groups are only starting to assemble
and if environmental differentiation is confined to satellites,
as indicated at lower redshifts.

It is clear that the fraction of red objects in the candidate
groups and in the field, at fixed stellar mass, is essentially
the same and we do not see evidence of color segregation with
environment. Of course, given the large cylinder length in
redshift (of order ±0.1 around the group location), our "group
sample" will have been heavily contaminated by unrelated
foreground and background field galaxies: our overdensity of 40%
suggests that about 70% of the photo-z "group sample" galaxies
are projected from the field. These projected galaxies will of
course heavily dilute any intrinsic color difference, and we
could in principle subtract them statistically. However, because
the red fractions are so indistinguishable, we have not attempted
to do this.
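The contamination figure follows directly from the measured excess. If the cylinders contain 1.4 times the number of galaxies expected from the field alone, the fraction of the "group sample" contributed by the field is

\[ f_{\rm field} = \frac{N_{\rm field}}{N_{\rm field} + 0.4\,N_{\rm field}} = \frac{1}{1.4} \approx 0.71, \]

leaving roughly 30% of the galaxies in the cylinders associated with the structures themselves.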

It is not clear that any such environmental segregation, at fixed
stellar mass, should have been expected. We have argued above
that the galaxies in the candidate groups are in general
unlikely, at the epoch at which we observe them, to be sharing
the same dark matter halo. A correspondingly small fraction of
the galaxies will be satellites, even in the larger photo-z
sample. Peng et al. (2012), amongst others, have presented clear
evidence that all of the environmental differentiation of the
galaxy population at low redshift is associated with the
quenching of star-formation in satellite galaxies, and there is
now also good evidence that this remains true at redshifts
approaching unity (Kovac et al. 2012 in prep.).

5. Summary & Conclusions 

We have applied a group-finder with linking lengths $\Delta r$ =
500 kpc (physical) and $\Delta v$ = 700 km/s to the zCOSMOS-deep
sample of 3502 galaxies at 1.8 < z < 3.0, yielding 42 systems
with three or more members. To try to understand what these
associations likely are, and what they will probably become, we
have constructed an analogous sample from 12 zCOSMOS-deep mock
samples which were extracted from the Millennium simulation mock
catalogues of Kitzbichler & White (2007), supplemented by a
single light cone from the Wang et al. (2008) simulation which
has a more realistic value of $\sigma_8$.

We refer to the detected systems as "candidate groups". We have
introduced the following terminology: a system in which all three
detected members are in the same halo is called a "real group"
and one in which only two are, a "partial group". Candidate
groups that will become real or partial groups by z = 0 are
called "proto-groups" and "partial proto-groups" respectively.

The number of candidate groups in the simulations agrees quite
well with the number in the sky. However, analysis of the
simulated candidate groups suggests that only a very small
fraction, less than 0.2% in the Kitzbichler & White (2007) sample
and none in the Wang et al. (2008) sample, already have all the
detected galaxies occupying the same halo at the time of
observation, i.e., are already "real groups". About 8% of the
candidate groups will however already have two members within the
same halo in the Kitzbichler & White (2007) sample.

Furthermore, 50% of the mock candidate groups will have assembled
all three galaxies into the same halo by z = 0 (i.e., are
"proto-groups" at the epoch of observation) and almost all (93%)
will have at least two galaxies in the same halo. Only 7% are
truly random associations whose members will never occupy the
same halo.

The mocks suggest that the important parameter that distinguishes
the fate of the candidate group is the apparent velocity
dispersion $v_{\rm rms}$. For $v_{\rm rms}$ < 300 km/s the
fraction of systems that will fully assemble all three members is
above 50% and for larger dispersions it is lower. The fraction
does not depend much on the projected angular size of the
candidate groups.

The observed candidate groups are being seen as they begin the
assembly process. Already by z ~ 1.8 (which is the lower limit of
our observational window) around 25% of the candidate groups
(observed at 1.8 < z < 3.0) will be partial groups, the bulk of
them becoming so within $\Delta a$ < 0.1 from their epoch of
detection, and within $\Delta a$ < 0.5 most proto-groups will
have evolved into real or partial groups.

If we look at today's groups and ask which of their progenitors
will have been seen in our spectroscopic sample at z > 1.8, then
we find that we should have detected ~35% of the progenitors of
today's massive clusters (of order
$10^{14}$ - $10^{15}\,M_\odot/h$) already at z ~ 2 and this would
rise to ~65% if we had 100% completeness in the zCOSMOS-deep
spectroscopic sample.

We can roughly estimate the overdensities of 
the spectroscopically detected structures and find 
that these are substantial, consistent with the idea 
that these systems will soon come together into 
assembled systems. 

We also detect a significant overdensity in the regions of these
candidate groups using the independent COSMOS photometric sample,
which shows a 40% excess in the numbers of galaxies above
$10^{10}\,M_\odot$ at the location of our spectroscopic candidate
groups as compared to the field, despite the very large sampling
cylinders ($\Delta z$ = ±0.1) required by the use of photo-z. We
do not, however, detect any significant differentiation in the
colours of the galaxies compared to the field. We might not have
expected to see such differences if most of the structures are
still assembling, given that at z < 1 environmental
differentiation of the galaxy population is confined to satellite
galaxies.